Early and Late Combinations of Criteria for Reranking Distributional Thesauri
نویسنده
چکیده
In this article, we first propose to exploit a new criterion for improving distributional thesauri. Following a bootstrapping perspective, we select relations between the terms of similar nominal compounds for building in an unsupervised way the training set of a classifier performing the reranking of a thesaurus. Then, we evaluate several ways to combine thesauri reranked according to different criteria and show that exploiting the complementary information brought by these criteria leads to significant improvements.
منابع مشابه
Unsupervised selection of semantic relations for improving a distributional thesaurus (Sélection non supervisée de relations sémantiques pour améliorer un thésaurus distributionnel) [in French]
Unsupervised selection of semantic relations for improving a distributional thesaurus Work about distributional thesauri has shown that the relations in these thesauri are mainly reliable for high frequency words. In this article, we propose a method for improving such a thesaurus through its re-balancing in favor of low frequency words. This method is based on a bootstrapping mechanism : a set...
متن کاملRéordonnancer des thésaurus distributionnels en combinant différents critères
In this article, we propose a method for improving distributional thesauri based on a bootstrapping mechanism: a set of positive and negative examples of semantically similar words are selected in an unsupervised way and used for training a supervised classifier. This classifier is then applied for reranking the semantic neighbors of the thesaurus used for example selection. We show how the rel...
متن کاملComparing Similarity Measures for Distributional Thesauri
Distributional thesauri have been applied for a variety of tasks involving semantic relatedness. In this paper, we investigate the impact of three parameters: similarity measures, frequency thresholds and association scores. We focus on the robustness and stability of the resulting thesauri, measuring inter-thesaurus agreement when testing different parameter values. The results obtained show t...
متن کاملNothing like Good Old Frequency: Studying Context Filters for Distributional Thesauri
Much attention has been given to the impact of informativeness and similarity measures on distributional thesauri. We investigate the effects of context filters on thesaurus quality and propose the use of cooccurrence frequency as a simple and inexpensive criterion. For evaluation, we measure thesaurus agreement with WordNet and performance in answering TOEFL-like questions. Results illustrate ...
متن کاملDiscovering Distributional Thesauri Semantic Relations
The paper presents technique and analysis to discover distributional thesauri relations by using statistical similarity of different word’s contexts. The application uses educational electronic text corpus and the Sketch Engine software statistical search to extract and compare word’s collocations from the related text corpus. The semantic search used is based on the evaluation and comparison o...
متن کامل